Here we model an intervention to address malnutrition among school-aged children in urban Hanoi. The model was developed across several iterative workshops in July 2023. These included decision definition, conceptual model development of the selected decision and an initial programming session with preliminary model results. The resulting model was further developed through August to November 2023. A test run of the intervention will be carried out by the Center for the Development of Organic Agriculture (CODAS) under the Association of Organic Agriculture of Vietnam.

Should urban Hanoi school boards invest time and money in creating school gardens? Should they invest in formal STEM education as part of these gardens?

The gardens will be existing areas on school grounds. The STEM option gardens will be designed to for teaching about nutrition and STEM (an approach to learning and development that integrates science, technology, engineering and maths) at primary and secondary schools. In the 2nd year the garden is expected to start running well. The 3rd year is when the STEM education plan will be fully running.

Methods

We use several different functions to create a working model of the garden intervention.

Chance Event Function: Generating Conditional Values

The chance_event function generates conditional values with:

\[ occurrence = \begin{cases} rbinom(length(value\_if), 1, chance) & if !one\_draw \\ rbinom(1, 1, chance) & if one\_draw \end{cases} \]

and final values with:

\[ chance\_event = occurrence \cdot value\_if + (1 - occurrence) \cdot value\_if\_not \]

Value Varier Function: Generating Synthetic Data

source("functions/vv.R")

The vv function generates synthetic data with varying means and coefficients of variation with:

  • \(n\): Number of data points to generate.
  • \(var\_mean\): Mean value for the data.
  • \(var\_CV\): Coefficient of Variation (CV) for the data.
  • \(distribution\): Distribution of the data (e.g., normal distribution).
  • \(absolute\_trend\): Absolute trend to add to the data.
  • \(relative\_trend\): Relative trend to apply to the data.
  • \(lower\_limit\): Lower limit for generated data.
  • \(upper\_limit\): Upper limit for generated data.

These are used to calculate annual means:

  • If both absolute and relative trends are absent, the annual means remain constant:

    \[ annual\_means = var\_mean \]

  • If an absolute trend is present, annual means are adjusted linearly:

    \[ annual\_means = var\_mean + absolute\_trend \cdot (0, 1, \ldots, n-1) \]

  • If only a relative trend is present, annual means change multiplicatively:

    \[ annual\_means = var\_mean \cdot (1 + relative\_trend / 100)^{(0, 1, \ldots, n-1)} \]

  • Generate random data points based on the specified distribution using calculated annual means:

    \[ out \sim \mathcal{N}(annual\_means, \lvert annual\_means \rvert \cdot \frac{var\_CV}{100}) \]

  • If specified, limit data points to the defined lower and upper limits:

    \[ out = \max(out, lower\_limit), \quad out = \min(out, upper\_limit) \]

Net Present Value (NPV)

source("functions/discount.R")

The discounted value of a time series value \(x_t\) at time \(t\) can be calculated using the formula:

\[ \text{Discounted Value}_t = \frac{x_t}{\left(1 + \frac{\text{Discount Rate}}{100}\right)^{t-1}} \]

Where:

  • \(x_t\) is the value at time \(t\),

  • \(\text{Discount Rate}\) is the discount rate provided to the function,

  • \(t\) is the time period (starting from 1).

  1. Net Present Value (NPV) (if calculate_NPV is set to TRUE): The Net Present Value of a time series with discounted values can be calculated as the sum of all discounted values:

    \[ \text{NPV} = \sum_{t=1}^{n} \frac{x_t}{\left(1 + \frac{\text{Discount Rate}}{100}\right)^{t-1}} \]

    Where \(n\) is the number of time periods in the time series.

Monte Carlo Simulation Function

source("functions/mcSimulation.R")
## 
## Attaching package: 'decisionSupport'
## The following objects are masked _by_ '.GlobalEnv':
## 
##     discount, vv
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE

We use the mcSimulation function to perform Monte Carlo simulations to estimate model outputs based on provided parameters and a model function.

Monte Carlo Simulation Process

The Monte Carlo simulation process generates a set of estimated model outputs based on random input samples, providing a distribution of potential outcomes including:

  • random input generation with \(x_1, x_2, \ldots, x_n\) as a set of \(n\) random input samples drawn from a given distribution, representing the input parameters of the model.

  • a model function \(f(x)\) that maps the input data \(x\) to the corresponding model output \(y\). This can be expressed as:

    \[ y_i = f(x_i), \quad i = 1, 2, \ldots, n \]

  • Estimated model outputs \(\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\) obtained by applying the model function to the random input samples:

    \[ \hat{y}_i = f(x_i), \quad i = 1, 2, \ldots, n \]

Urban Hanoi school garden simulation

# Source our model
source("CODAS_Garden_Model.R")
## Warning: Variable: outside_investment_value   distribution: posnorm
## Calculated value of 5%-quantile: 1.77932900574854
##   Target value of 5%-quantile:     1
##   Calculated cumulative probability at value 1 : 0.0289226319595083
##   Target  cumulative probability at value 1 : 0.05
##   Mean scaled difference: 0.4215474
## Warning in paramtnormci_numeric(p = p, ci = ci, lowerTrunc = lowerTrunc, : Calculated value of 5%-quantile: 1.77932900574854
##   Target value of 5%-quantile:     1
##   Calculated cumulative probability at value 1 : 0.0289226319595083
##   Target  cumulative probability at value 1 : 0.05
##   Mean scaled difference: 0.4215474
# Ensure consistent results with the random number generator
# not for each 'run' of the MC simulation but for 
# consistency each time we call on the simulation 
set.seed(1234) 

garden_simulation_results <- mcSimulation(
  estimate = estimate_read_csv("inputs_school_garden.csv"),
  model_function = school_garden_function,
  numberOfModelRuns = 1000, #run 1000 times
  functionSyntax = "plainNames"
)
## Warning: Variable: outside_investment_value   distribution: posnorm
## Calculated value of 5%-quantile: 1.77932900574854
##   Target value of 5%-quantile:     1
##   Calculated cumulative probability at value 1 : 0.0289226319595083
##   Target  cumulative probability at value 1 : 0.05
##   Mean scaled difference: 0.4215474

## Warning: Calculated value of 5%-quantile: 1.77932900574854
##   Target value of 5%-quantile:     1
##   Calculated cumulative probability at value 1 : 0.0289226319595083
##   Target  cumulative probability at value 1 : 0.05
##   Mean scaled difference: 0.4215474

The way we present NPV values can influence decision makers. The same information presented in different ways can even lead to different decisions. Here we derive a plot that compares the decision options as pure NPV values.

source("functions/plot_distributions.R")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
plot_distributions(mcSimulation_object = garden_simulation_results, 
                                    vars = c("NPV_garden","NPV_garden_STEM", "NPV_no_garden"),
                                    method = 'hist_simple_overlay', 
                                    base_size = 7, 
                                    x_axis_name = "Comparative NPV outcomes")

Framing the outcomes

Under Prospect Theory the way we present NPV values can influence decision makers - the same information presented in different ways can lead to different decisions. For example, framing a projected NPV gain as a “reduction in potential loss” might make it more attractive to decision makers due to loss aversion.

Here we plot the distribution for the decision and frame the projected NPV gain for the ‘decision’ (distributions for the two options with the NPV values of the no garden option subtracted from those for the garden). If we display this as a “Reduction in potential loss” it might be more attractive to decision makers due to loss aversion, i.e. school boards might put more emphasis on avoiding potential losses than on seeking gains. We can frame our results as a strategy that minimizes losses rather than one that maximizes gains.

plot_distributions(mcSimulation_object = garden_simulation_results, 
                                    vars = c("decision", "decision_STEM"),
                                    method = 'hist_simple_overlay', 
                                    base_size = 7,  
                                    x_axis_name = "Reduction in potential loss")

Summary of results for the decision

Summary

Summary of the garden intervention options with gt_plt_summary() from {gtExtras} and with options from {svglite}.

# Subset the outputs from the mcSimulation function (y) to summarize only on the variables that we want.
# names(garden_simulation_results$x)
mcSimulation_summary <- data.frame(garden_simulation_results$x[2:61],
 # names(garden_simulation_results$x)
                                 garden_simulation_results$y[1:7])

gt_plt_summary(mcSimulation_summary) 
mcSimulation_summary
1000 rows x 67 cols
Column Plot Overview Missing Mean Median SD
discount_rate 3.49.4 0.0% 6.5 6.5 0.9
size_of_garden 28123 0.0% 75.2 75.2 14.9
CV_value 0.010.53 0.0% 0.3 0.3 0.1
inflation_rate 2.912.2 0.0% 7.5 7.5 1.5
if_students_like 0.390.90 0.0% 0.6 0.6 0.1
if_parents_like 0.351.00 0.0% 0.7 0.7 0.1
if_community_likes 0.010.98 0.0% 0.5 0.5 0.2
if_effective_manage 0.410.82 0.0% 0.6 0.6 0.1
if_garden_yield_enough 0.160.76 0.0% 0.5 0.5 0.1
if_garden_healthy 0.290.99 0.0% 0.7 0.7 0.1
if_teachers_like 0.031.00 0.0% 0.5 0.5 0.2
if_effective_teaching 0.021.00 0.0% 0.6 0.6 0.2
if_effective_training 0.020.99 0.0% 0.5 0.5 0.2
if_offer_green_space 0.320.99 0.0% 0.7 0.7 0.1
if_reduce_polution 0.080.64 0.0% 0.3 0.3 0.1
if_biophysical_good 0.030.65 0.0% 0.3 0.3 0.1
equipment_cost 31120 0.0% 74.4 74.8 15.6
construction_cost 439 0.0% 22.6 22.6 4.8
garden_designing_costs 7.817.0 0.0% 12.5 12.5 1.6
teacher_training_cost 126 0.0% 12.5 12.7 4.5
school_board_planning 314 0.0% 9.0 9.0 1.8
teaching_equipment 313 0.0% 7.6 7.5 1.5
compost_starting 313 0.0% 7.5 7.5 1.5
worm_starting 0.96.4 0.0% 3.4 3.4 0.9
livestock_costs 0.66.5 0.0% 3.5 3.5 0.9
if_family_pays_establishment 0.100.61 0.0% 0.3 0.4 0.1
establishment_family_portion_paid 0.080.66 0.0% 0.3 0.3 0.1
maintaining_labor 1749 0.0% 32.6 32.4 4.6
teacher_salary_cost 1534 0.0% 24.9 25.0 3.1
teaching_equipment_annual 313 0.0% 7.6 7.6 1.5
teaching_tools 0.66.0 0.0% 3.5 3.5 0.9
seed_costs 0.72.6 0.0% 1.5 1.5 0.3
fertilizer 0.52.5 0.0% 1.5 1.5 0.3
plant_protection 0.76.2 0.0% 3.5 3.6 0.9
livestock_maint 013 0.0% 6.0 6.1 2.4
annual_teacher_training 2.36.1 0.0% 4.0 4.0 0.6
if_school_has_canteen 0.080.61 0.0% 0.3 0.3 0.1
canteen_savings 213 0.0% 7.5 7.5 1.6
sale_of_yield 139 0.0% 20.1 20.1 6.0
extra_cirricular_savings 0150 0.0% 60.9 61.6 24.4
formal_edu_savings 032 0.0% 9.2 8.8 5.8
formal_edu_savings_STEM 0144 0.0% 60.4 60.1 24.6
outside_investment_value 0258 0.0% 33.0 23.4 32.5
outside_investment_value_STEM 0888 0.0% 170.0 125.1 155.6
increased_enrollment_value 028 0.0% 9.3 8.3 6.0
increased_enrollment_value_STEM 0143 0.0% 50.8 49.6 27.2
if_increase_tuition 0.0000.041 0.0% 0.0 0.0 0.0
if_increase_tuition_STEM 0.010.40 0.0% 0.2 0.2 0.1
tuition_increase 213 0.0% 7.5 7.5 1.6
child_veg_access 3.312.2 0.0% 7.5 7.5 1.5
child_healthier_choices 1172 0.0% 58.3 57.2 23.8
child_healthier_choices_STEM 2704 0.0% 298.1 291.7 125.4
green_space_value 51252 0.0% 149.5 151.0 29.4
reduce_polution_value 524 0.0% 15.0 14.9 3.1
school_event_value 073 0.0% 29.6 29.5 12.0
school_event_freq 014 0.0% 6.1 6.0 2.5
value_of_non_garden_land_use 863 0.0% 35.5 35.4 9.3
if_parking 0.010.22 0.0% 0.1 0.1 0.0
parking_value 56256 0.0% 149.2 147.2 30.2
costs_of_non_garden_land_use 0.37.3 0.0% 3.0 2.9 1.2
NPV_garden -603K 0.0% 624.8 554.3 355.1
NPV_garden_STEM 754K 0.0% 1,485.3 1,388.0 649.7
NPV_no_garden 692K 0.0% 327.5 247.4 277.6
decision -1.5K2.4K 0.0% 297.3 293.1 455.5
decision_STEM -9794K 0.0% 1,157.8 1,070.2 711.4
total_costs 267562 0.0% 412.8 412.4 36.3
total_costs_STEM 503816 0.0% 642.7 644.1 43.4

Summary of the savings for the passive education garden option

summary(garden_simulation_results$y$decision)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -1491.97    58.95   293.11   297.26   545.09  2376.83

Summary of the savings for the formal STEM education garden option

summary(garden_simulation_results$y$decision_STEM)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -979.1   685.6  1070.2  1157.8  1548.9  4225.0

Summary of costs

summary(garden_simulation_results$y$total_costs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   266.6   389.5   412.4   412.8   437.0   562.3
summary(garden_simulation_results$y$total_costs_STEM)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   503.1   611.8   644.1   642.7   670.6   816.3

First year

summary(garden_simulation_results$y$Cashflow_garden1)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -134.073  -33.497    6.993   16.995   59.232  383.734
summary(garden_simulation_results$y$Cashflow_garden_STEM1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -135.31   75.28  148.53  169.54  239.58  715.86

Cashflow of the garden option without formal STEM education

source("functions/plot_cashflow.R")
plot_cashflow(mcSimulation_object = garden_simulation_results, 
              cashflow_var_name = "Cashflow_garden")
## Warning in FUN(X[[i]], ...): NAs introduced by coercion

## Warning in FUN(X[[i]], ...): NAs introduced by coercion

## Warning in FUN(X[[i]], ...): NAs introduced by coercion

## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning: Removed 5 rows containing missing values (`geom_line()`).
## Removed 5 rows containing missing values (`geom_line()`).

Cashflow of the garden option with formal STEM education

source("functions/plot_cashflow.R")
plot_cashflow(mcSimulation_object = garden_simulation_results, 
              cashflow_var_name = "Cashflow_garden_STEM")

Expected Value of Perfect Information (EVPI)

Here we assess value of information with the multi_EVPI function.

# Subset the outputs from the mcSimulation function (y) by selecting the correct variables be sure to run the multi_EVPI only on the variables that we want. Find them with names(garden_simulation_results$y)
mcSimulation_table <- data.frame(garden_simulation_results$x, 
                                 garden_simulation_results$y[1:5])

Value of information for the garden option (no STEM).

source("functions/multi_EVPI.R")
# first_out_var is the first result variable in the table, "NPV_garden" in our case.
evpi <- multi_EVPI(mc = mcSimulation_table, first_out_var = "NPV_garden")
## [1] "Processing 5 output variables. This can take some time."
## [1] "Output variable 1 (NPV_garden) completed."
## [1] "Output variable 2 (NPV_garden_STEM) completed."
## [1] "Output variable 3 (NPV_no_garden) completed."
## [1] "Output variable 4 (decision) completed."
## [1] "Output variable 5 (decision_STEM) completed."
source("functions/plot_evpi.R")
plot_evpi(evpi, decision_vars = "decision")

Value of information for the garden option with formal STEM education.

# using the results of the same multi_EVPI
plot_evpi(evpi, decision_vars = "decision_STEM")
## Warning: There are no variables with a positive EVPI. You probably do not need
## a plot for that.

PLS

We use Projection to Latent Structures model to get some sense of the correlation strength and direction for model variables and our outcome variables.

For passive education garden option

source("functions/pls_model.R")
pls_result <- pls_model(object = garden_simulation_results,
                                resultName = names(garden_simulation_results$y)[1], # the "NPV_garden" 
                                ncomp = 1)
# read in the common input table
input_table <- read.csv("inputs_school_garden.csv")
# source the plot function
source("functions/plot_pls.R")
plot_pls(pls_result, input_table = input_table, threshold = 0.9)

For school garden with formal STEM education

pls_result_STEM <- pls_model(object = garden_simulation_results,
                                resultName = names(garden_simulation_results$y)[2], # the "NPV_garden_STEM" 
                                ncomp = 1)

plot_pls(pls_result_STEM, input_table = input_table, threshold = 0.9)

The full repository can be accessed at https://github.com/CWWhitney/nifam_codas_school_garden with the following QR code.